Recoding the Variables
To create the tables and charts we need, we usually need to ‘recode’ some of the variables. In OfficeReports this is done by creating derived variables. OfficeReports has standard functionality to merge and group categorical variables, to create periods from date variables and to create intervals and quantiles from numeric variables. Besides that, you can create just the variables you want by using ‘Formulas’.
All derived variables are based on formulas. These formulas can be changed just by editing or by using the 'Formula Editor'. Here you can also read more about how to refer to the variables and categories in expression and about what operators you can use. It is even possible to create conditional expressions.
It is important to realize that derived variables just like original variables contain observations for the cases (respondents). So it is not possible to calculate NPS by trying to detract the category 'Detractors' from category 'Promoters'. Click here to see how you calculate NPS.
Below we describe how to create some of the different kind of derived variables:
In OfficeReports, we do not change the original variables (red icon). We might need them again when importing more data. So we start by making a copy of the variable we want to recode. Right-click this variable and choose 'Copy' from the context menu or click 'New Variable->Copy selected variable'. The copy (blue icon) will be in the bottom of the list. Now click the '+' in front of the variable so you can see the categories. Now you can merge the categories you need to merge by selecting them, right-clicking them and choose 'Merge'. It is also possible to click the 'Categories' button.
If you have multiple variables you need to recode the same way, click the 'Create other variables like this one' button and select the other variables that need recoding. They will all be generated for you.
Many data sets contain numeric data, which it is often relevant present as intervals or categories. For example, a numeric variable containing numbers from 15-90 can be organized into intervals, or a numeric variable containing the numbers ‘1’, ‘2’, ‘3’, ‘4’, ‘5’ and ‘99’ which can be organized into exact 6 categories.
Mark a numeric variable and you will see 4 buttons on the right. Here we will explain what they can be used for:
In our example, ‘Age’ is one of the variables. Say, for instance, that we are interested in creating four different age groups. The following shows how to create these into groups:
-
Locate the numerical variable. Click on it and to the right choose ‘Create Interval Variable’ on the right. Select for instance 3 intervals:
-
A new variable will now appear in the tree-structure with a blue icon, signaling it is a derived categorical variable. OfficeReports will give this type of new derived variable the same text as the original. The name of the new variable is the same as the original added ‘Intervals’.
-
Variable name and text are editable in the text boxes on the right-hand side.
-
To alter the intervals, we highlight each category that has appeared under the ‘Age Intervals’ variable and change the formula that is written in the bottom box on the right-hand side. Click `Save´ when changing a formula.
-
If you need extra categories, these can be added using ‘Categories àNew Category’. Enter a category name (interval) and add a formula.
-
We are now ready to use the new variable for reporting or in combination with other variables.
The same way as we create intervals, we can also create quantiles:
When used in a table or chart definition, the median, quartiles, quintiles, percentiles etc. will be recalculated and only use the cases being reported.
The Net Promoter Score (NPS) is computed as the difference between two sets of percentages (most commonly, the proportion of people to rate a brand as and 8 to 10 less the proportion to rate the brand as a 6 or less)
If you have a numeric NPS variable in your dataset containing numbers from 0 to 10, you can create a NPS variable by clicking the button 'Create NPS Variable'. You will get a derived categorical variable containing the 3 categories 'Detractors', 'Passives' and 'Promoters'. They have values -100, 0 and 100 so the 'Mean' in the table equals the NPS score. OfficeReports will write 'NPS' instead of 'Mean' in the table if the variable name contains 'NPS'.
In case the original variable is categorical and contains these 3 categories, the following rules must be followed to calcculate the NPS score in tables and charts:
1. The NPS variable must include ‘NPS’ as a part of the variable name
2. The NPS variable must contain the following categories and values:
3. Promoters must have value = 100
4. Passives must have value = 0
5. Detractors must have value = -100
With these variable settings, You can check 'NPS' in the table definition to add the NPS score to the table..
Quite often a dataset contains the results for e.g. a scale question, but without labelling. In that case the variable is imported as a numerical variable. We can add the labels by clicking 'Create Categories', and then change the texts of the categories of the derived variable.
This video shows an example: From Numeric to Categorical Variable
The same way as we create intervals and quantiles, we also create periods. Select a 'date' Variable and click the 'Create Periods' button:
The resulting derived variable contains categories which divide the dates into the requested periods.
Sometimes you might want to aggregate a number of questions to show an average of these.
Create a new variable, give it a name and a text, and select the type ‘Aggregate’. Then, click ‘Set Content’ and select all the variables which define the aggregate variable.
The aggregate variable shows the sum from the contained variables, divided by the number of Variables.
This video show how to define a derived aggregate variable: Video: Aggregate Variable
You can merge Categorical Variables either by selecting them and right-clicking ->Merge or by clicking ‘New Variable’à’Merge Variables’. Depending whether the variables contain the same categories or different categories the following will happen:
Merging variables containing identical categories
When the two variables ‘Variable 1’ and ‘Variable 2’ contains identical categories – for example A, B and C, and the tow variables are merged OfficeReports will produce a new variable containing the categories A, B and C. Each of the categories is based on an ‘or’ condition/formula:
- A: (Variable 1 = A or Variable 2 = A)
- B: (Variable 1 = B or Variable 2 = B)
- C: (Variable 1 = C or Variable 2 = C)
The merge of variables containing identical categories can as well be a merge of many variables
Merging variables containing categories which are not identical
When merging two or more variables containing not-identical categories – for example Variable 1 – ‘Male’/’Female’ and Variable 2 – ‘East’/’West’, then the new merged variable will contain the following categories – and related formulas:
- ‘Male’ and ‘East’
- ‘Male’ and ‘West’
- ‘Female’ and ‘East’
- ‘Female’ and ‘West’
The selected order matters, so if we first selected Variable 2 followed by Variable 1 this would have been the result:
- ‘East’ and ‘Male’
- ‘West’ and ‘Male’
- ‘East’ and ‘Female’
- ‘West’ and ‘Female’
A multiple response questions is mostly imported as many variables, one variable for each answer choice. OfficeReports supports multiple response questions, all we have to do is group them, just like in SPSS. By clicking 'Find groups by name', OfficeReports will create groups for all variables that have similar labels. So "Q1_1, Q1_2 and Q1_3" will be grouped into variable "Q1". If the results from this functionality are not as expected, please use 'Group" and choose the variables you want to group yourself:
In the variable window press ‘New Variable’à’Group Variables’ and a window pops up from where we can choose variables to group.
Based on three examples we will describe the detailed features of the function:
The example dataset contains two variables ‘Gender’ (two categories) and ‘Department’ (five categories). If we select these two variables to define the Group Variable, the output will be a derived variable containing the 5+2 categories:
The test dataset contains four variables regarding relations to other departments. Each variable contains one category, ‘1’ = ‘a relation’.
We can produce a derived variable based on these four variables, but as the category name in each variable is the same it is relevant to replace it by the name of the underlying variable:
The test dataset contains four variables regarding Microsoft Office training courses.
The category ‘1’ means participation in the individual course and ‘0’ not participated.
Observe
If a variable contains the two categories ‘0’ and ‘1’ OfficeReports will automatically give these categories values corresponding to their names.
We now want to group those who have participated in the courses into a new derived variable containing 5 categories:
- Outlook
- Word
- Excel
- PowerPoint
- ‘Missing’ = did not participate in any of these
In the variable window press ‘New Variable’à’Group Variables’ and select the 4 course variables. Then check the ‘Exclude categories with the value ‘0’’, and press ‘Create Group Variable’.
OfficeReports will now produce a new derived variable like this:
-
As we have excluded all categories with the value ‘0’ each of the variables now contains one category; ‘1’.
-
OfficeReports detects the equal names and replace the similar names with the variable names – in this case ‘Outlook’, ‘Word’ etc.
-
Additionally, OfficeReports will add an extra category ‘Missing’. This category is defined by all the all the excluded categories – in this case (Outlook=0 AND Word=0 AND Excel=0 AND PowerPoint=0)
Read more in the video: Video: Grouping 0/1 variables
Click 'New Variable->Create Variable', and a new variable is created. If you created a categorical variable, you can add categories and define formulas for them. If you created a numerical variables you can edit the formula immediately. Use the 'Formula Editor' to create the formulas.
Sometimes it is useful to organize open text into categories in order to work with the data quantitatively. If this does not require an advanced coding of the open-ended text OfficeReports can easily organize the open-ended text into categories:
-
Locate the text variable (in our case ‘Comment’). Click on it and to the right and choose ‘Create Categories:
-
A new variable will now appear in the list with a blue icon containing as many categories as we have unique texts in the original numerical variable.
Observe
OfficeReports only looks at the first 256 characters to decide if a text is ‘unique’..
After having the open-ended texts organized into categories OfficeReports can produce a simple categorization of the open-ended text by using the feature (‘Categories’à’Merge Categories’)
Using the Advance Category Settings we can create a banner variable that can contain any data we want. This way we can add benchmark columns in the banner (described below the picture).
Select a category of a derived variable and click ‘Advanced View’ on the right, which opens up a group of new features.
See how to use this functionality in the example in the Filtering chapter.
Example
The scene: We are creating reports for each Team in every Department and we want to create a table which shows all the teams of the department a team belongs to. We also want to add columns containing the results of the whole company, of the whole department and of last year.
The Report Filter is set to category "2020" of variable Year. We select a specific department as Hierarchy1 Filter and a specific team as Hierarchy2 Filter.
To do this, we make a copy of the Teams variable and add the Categories 'The Company', 'The Department' and 'Last Year'.
We have defined Year "2020" as Report Filter
The formula
Content:
Either observations defined by the report/hierarchy filter – or – defined by formula.
Ignore Filters:
One or more filters on different levels might be ignored.
Result Display:
For Automated report productions you can exchange the category name with the category names from a Hierarchi filter. If i.e. Hierarchi 2 is ‘Department’, and we actually are producing a report for the H&R department, then we can replace a static name with the name of actual report/department name.
Show difference:
Instead of showing an actual value, we can show the difference to one of the other categories in the variable. If i.e. 2018 shows ‘58%’ and 2017 shows ‘48%’, we can choose to display the difference ‘10%’.
One of the features in OfficeReports which makes this process smooth is ‘Create other variables like this one’. This feature relates to derived categorical variables, so to find the feature we must first have produced at least one derived categorical variable.
Mark a derived categorical variable and this feature emerges:
A few examples explain how this feature works:
10 numeric variables contain numbers from 1 – 99 which we need organized as derived category variables containing the two categories ‘1 – 49’, and ’50 – 99’.
-
Mark one of the numeric variables and use the feature ‘Create Interval Variable’ to produce a derived categorical variable containing two categories ‘1 – 49’, and ’50 – 99’
-
Mark the newly derived categorical variable and press the button ‘Create other variables like this one’
-
A window pops up showing all numeric variables, which potentially could be turned into similar, derived categorical variables.
-
Select the 9 similar numeric variables, and OfficeReports will start producing the 9 derived categorical variables each containing the two categories ‘1 – 49’, and ’50 – 99’.
15 categorical variables contain the same categories:
- ‘Strongly Agree’
- ‘Agree’
- ‘Disagree’
- ‘Strongly Disagree’
We need to organize these as derived category variables each containing the two categories:
- ‘Strongly Agree’ or ‘Agree’
- ‘Disagree’ or ‘Strongly Disagree’
-
Mark one of the categorical variables and use the feature ‘Copy Selected Variable’.
-
Press the button ‘Categories’à’Merge Categories and produce the two categories:
-
‘Strongly Agree’ or ‘Agree’
-
‘Disagree’ or ‘Strongly Disagree’
-
-
Mark the newly derived categorical variable and press the button ‘Create other variables like this one’.
-
A window pops up showing all numeric variables, which potentially could be turned into similar, derived categorical variables.
-
Select the 14 similar categorical variables, and OfficeReports will start producing the 14 derived categorical variables each containing the two categories.
Observe
The derived categorical variable is always based on a ‘parent’ variable. This means that the categories relate to a parent variable. Now what if some categories relate to one parent variable and other to other variables? How can OfficeReports then ‘Create other variables like this one’? The answer is this: OfficeReports will detect only one variable as parent. This means that only formulas and formula elements related to the parent will change dynamically when we ‘Create other variables like this one.
This short video introduces some of the options: Creating New Variables